Julia Lesson

By Andrew Ma and Luke Miller

What is Julia?

Julia is a high-level, high-performance dynamic programming language whose syntax looks like Ruby or Python crossed with MATLAB. It is meant to bridge the gap between mathematics and programming while also being very efficient at crunching numbers. Most of Julia's base library is written in Julia itself (woo metaprogramming).



In [1]:

    
# variable
x = 10
println(x)

# super hard math
y = x + 1
println(y)

# reassigning a variable
x = x + 1
println(x)

# unicode names
δ = 0.00001
println(δ)

안녕하세요 = "Hello"
println(안녕하세요)









    



10
11
11
1.0e-5
Hello

Stylistic Conventions:

Names of variables are in lower case.
Word separation can be indicated by underscores ('_'), but use of underscores is discouraged unless the name would be hard to read otherwise.
Names of Types and Modules begin with a capital letter and word separation is shown with upper camel case instead of underscores.
Names of functions and macros are in lower case, without underscores.
Functions that write to their arguments have names that end in !. These are sometimes called “mutating” or “in-place” functions because they are intended to produce changes in their arguments after the function is called, not just return a value.
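
As a small illustration of the last convention, the sketch below contrasts Base's sort with its in-place counterpart sort!, and then defines a mutating function of its own. The name double! is hypothetical, made up for this example; it is not a library function.

```julia
v = [3, 1, 2]

# sort returns a new array and leaves its argument untouched
sorted = sort(v)
println(sorted)
println(v)

# sort! reorders its argument in place
sort!(v)
println(v)

# A user-defined mutating function (hypothetical), named with a trailing !
function double!(xs)
    for i in eachindex(xs)
        xs[i] *= 2
    end
    return xs
end

double!(v)
println(v)
```

The trailing ! is only a naming convention. The language does not enforce it, so it is up to the author of a function to follow it.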

Numeric Types



In [2]:

    
# Overflow example
x = typemax(Int64)
println(x)
println(x+1)









    



9223372036854775807
-9223372036854775808
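
The wraparound above is ordinary fixed-width integer arithmetic: adding 1 to the largest Int64 lands on the smallest one. A minimal sketch that checks this, and shows one way around it by promoting to the arbitrary-precision BigInt type first (the variable names are only for illustration):

```julia
x = typemax(Int64)

# Int64 arithmetic wraps around silently
wrapped = x + 1
println(wrapped == typemin(Int64))

# Promoting to BigInt before adding avoids the overflow
exact = big(x) + 1
println(exact)
println(typeof(exact))
```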



In [3]:

    
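# Numeric literal coefficients: a literal written directly before a variable multiplies it
x = 3
println(2x^2 - 3x + 1)      # same as 2*x^2 - 3*x + 1
println(1.5x^2 - .5x + 1)   # same as 1.5*x^2 - 0.5*x + 1
println(2^2x)               # parsed as 2^(2x), not (2^2)*x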



In [4]:

    
# zero(x) and one(x) return the additive and multiplicative identity for the type of x

println(zero(1.0))
println(one(0))

Mathematical Operations



In [5]:

    
# Char
a = 'a'
println(a)

# String
str = "I'm a string"   # named str so it does not shadow Base's string function
println(str)









    



a
I'm a string



In [6]:

    
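# Functions
# Long form: the value of the last expression is returned
function e(x,y)
    x+y
end

# Multiple values are returned as a tuple
function e2(x,y)
    x+y, x-y
end

# Compact "assignment form" definition
f(x,y) = x + y
g = f   # functions are ordinary values and can be bound to other names

# Unicode function names work too
∑(x,y) = x + y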

println(e(1,2))
println(e2(1,2))
println(f(1,2))
println(g(1,2))
println(∑(1,2))



In [7]:

    
# Functions continued
println(+(1, 2, 3))
h = +
println(h(1,2,3))

println(map(x -> x^2 + 2x - 1, [1,3,-1]))

bar(a,b,x...) = (a,b,x)
println(bar(1,2,3,4,5,6))

function optionalArg(x,y,z=0)
    x+y+z
end

println(optionalArg(1,2))
println(optionalArg(1,2,3))









    



6
6
[2,14,-2]
(1,2,(3,4,5,6))
3
6



In [8]:

    
# Scope
module A
a = 1 # a global in A's scope
end

module B
# b = a # would error as B's global scope is separate from A's
    module C
    c = 2
    end
b = C.c # can access the namespace of a nested global scope
        # through a qualified access
import A # makes module A available
d = A.a
# A.a = 2 # would error with: "ERROR: cannot assign variables in other modules"
end









    Out[8]:





B



In [9]:

    
# Method
k(x::Number, y::Number) = 2x - y;
println(k(1,2))



In [10]:

    
# Things with types and arrays

num = 12
println(typeof(num))
println(convert(UInt8, num))

numArray = Any[1 2 3; 4 5 6]
println(typeof(numArray))
println(numArray)
convert(Array{Float64}, numArray)

# Define your own conversion
import Base.convert
convert(::Type{Bool}, x::Real) = x==0 ? false : x==1 ? true : throw(InexactError())
println(convert(Bool, 1))
println(convert(Bool, 0))









    



Int64
12
Array{Any,2}
Any[1 2 3; 4 5 6]
true
false

Type System and Polymorphism

Julia is dynamically typed, with some of the advantages of static typing! You can add type annotations that tell the compiler what type a value is expected to have.



In [32]:

    
1+2









    Out[32]:





3



In [11]:

    
(1+2)::AbstractFloat









    



TypeError: typeassert: expected AbstractFloat, got Int64



In [12]:

    
(1+2)::Int









    Out[12]:





3
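
The :: operator in the cells above is a type assertion: it returns the value unchanged when the value is an instance of the given type and throws a TypeError otherwise. The sketch below checks the same relationships without throwing, using isa and the subtype operator <:, and then catches the failed assertion. The variable names are only for illustration.

```julia
n = 1 + 2

println(isa(n, Int))                # n is an Int
println(isa(n, AbstractFloat))      # but not any kind of float
println(Int <: Integer)             # Int is a subtype of Integer
println(Float64 <: AbstractFloat)   # Float64 is a subtype of AbstractFloat

# Catch the failed assertion instead of letting it stop the cell
result = try
    (1 + 2)::AbstractFloat
catch err
    err
end
println(typeof(result))
```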

Julia has a nice way to call a different method based on the types of the arguments passed in: multiple dispatch. Julia determines which method to dispatch the call to at run-time.

Example function headers:

function collide(me::Circle, other::Rectangle)
function collide(me::Polygon, other::Circle)
function collide(me::Polygon, other::Rectangle)

Then when you call collide(me, other), Julia dispatches the call to the correct method.
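
The collide headers can be turned into a small runnable sketch. The abstract type Shape, the empty shape types, the fallback method, and the returned strings are all assumptions added for illustration. The example uses the struct and abstract type keywords of Julia 0.6 and later; older releases spell these type (or immutable) and abstract.

```julia
# Hypothetical shape types, just enough to dispatch on
abstract type Shape end
struct Circle <: Shape end
struct Rectangle <: Shape end
struct Polygon <: Shape end

# One method per combination of argument types
collide(me::Circle, other::Rectangle) = "circle hits rectangle"
collide(me::Polygon, other::Circle) = "polygon hits circle"
collide(me::Polygon, other::Rectangle) = "polygon hits rectangle"

# Fallback for any pair without a more specific method
collide(me::Shape, other::Shape) = "no specific method for this pair"

println(collide(Circle(), Rectangle()))
println(collide(Polygon(), Circle()))
println(collide(Rectangle(), Circle()))   # falls back to the generic method
```

The method is chosen from the run-time types of both arguments, not just the first, which is what distinguishes multiple dispatch from single-dispatch method calls.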



In [13]:

    
type Point
    x::Float32
    y::Float32
end

type Vector2D
    x::Float32
    y::Float32
end

type UnitVector2D
    x::Float32
    y::Float32

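    # Inner constructor: scale v to unit length. The length is computed by hand
    # because norm has no method for Vector2D.
    UnitVector2D(v::Vector2D) = (len = sqrt(v.x^2 + v.y^2); new(v.x/len, v.y/len))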
end



In [14]:

    
#Union Types:
VecOrUnit = Union{Vector2D, UnitVector2D}
dot(u::VecOrUnit, v::VecOrUnit) = u.x*v.x + u.y*v.y









    Out[14]:





dot (generic function with 1 method)



In [15]:

    
# Generate random 4x4 array
randomArray = rand(4,4)









    Out[15]:





4×4 Array{Float64,2}:
 0.277027  0.0456322  0.21335   0.574605 
 0.205414  0.0257954  0.719418  0.284063 
 0.548858  0.891233   0.172288  0.0619572
 0.810177  0.959946   0.841337  0.317503



In [16]:

    
# Broadcasting applies a binary operation element by element across arrays
broadcast(+, randomArray, randomArray)









    Out[16]:





4×4 Array{Float64,2}:
 0.554053  0.0912644  0.426701  1.14921 
 0.410828  0.0515908  1.43884   0.568127
 1.09772   1.78247    0.344576  0.123914
 1.62035   1.91989    1.68267   0.635006
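
The same operation can be written with dot syntax, and broadcasting also stretches a scalar or a vector to match a larger array. A minimal sketch with a small fixed matrix so the results are easy to check by hand (the names A and col are only for illustration):

```julia
A = [1.0 2.0; 3.0 4.0]

# Dot syntax gives the same result as calling broadcast directly
println(A .+ A == broadcast(+, A, A))

# A scalar is combined with every element
scaled = A .* 10
println(scaled)

# A vector is treated as a column and added to every column of the matrix
col = [100.0, 200.0]
shifted = A .+ col
println(shifted)
```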

Why use Julia?

Julia is fast! In the figure, the benchmark times are relative to C, where C = 1.0. You can even call C code directly if you need more speed.



In [1]:

    
# Calling C code
t = ccall( (:clock, "libc"), Int32, ())
println(t)

path = ccall((:getenv, "libc"), Cstring, (Cstring,), "SHELL")
unsafe_string(path)









    



3665644






    Out[1]:





"/bin/bash"

Julia is designed for parallelization and does not impose any particular style of parallelization on its users. The following example demonstrates how to count the number of heads in a large number of coin tosses in parallel.



In [18]:

    
nheads = @parallel (+) for i=1:100000000
  rand(Bool)
end









    Out[18]:





49997452



In [19]:

    
@time nheads = @parallel (+) for i=1:100000000
  rand(Bool)
end









    



  3.035767 seconds (200.02 M allocations: 2.981 GB, 7.63% gc time)






    Out[19]:





49997575
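
@parallel is the spelling used by the Julia release these cells were run on. In Julia 1.x the same reduction loop is written with @distributed from the Distributed standard library. A minimal sketch under that assumption, with a smaller loop count (the variable n is introduced here) so it finishes quickly. With no worker processes added it simply runs on the current process.

```julia
using Distributed

# addprocs(4)   # uncomment to add worker processes; the count is arbitrary

n = 1_000_000

# Each iteration yields 0 or 1, and the (+) reducer sums the results
nheads = @distributed (+) for i = 1:n
    Int(rand(Bool))
end

println(nheads)
```

Because a reducer is given, the loop waits for all iterations to finish and returns the combined total, which should land close to n/2.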

DataFrames



In [2]:

    
using DataFrames



In [21]:

    
# DataArray
dv = @data([NA, 3, 2, 5, 4])
println(mean(dv))

println(mean(dropna(dv)))

convert(Array, dropna(dv))

println(dv)

# converting NAs: replace each NA with 11 while converting to an Array
dv = @data([NA, 3, 2, 5, 4])
println(convert(Array, dv, 11))









    



NA
3.5
[NA,3,2,5,4]
[11,3,2,5,4]
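
NA and the @data macro come from the DataArrays package that DataFrames relied on when this notebook was written. Julia 1.x has a built-in missing value that plays the same role. A minimal sketch of the same three steps under that assumption, using only Base and the Statistics standard library (the variable names are only for illustration):

```julia
using Statistics

v = [missing, 3, 2, 5, 4]

# missing propagates through the computation
println(mean(v))

# skipmissing iterates over the non-missing values only
clean_mean = mean(skipmissing(v))
println(clean_mean)

# coalesce replaces each missing value with a default
filled = coalesce.(v, 11)
println(filled)
```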



In [22]:

    
df = DataFrame(A = 1:10, B = ["M", "F", "F", "M", "F", "M", "F", "F", "M", "M"])









    Out[22]:




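│ Row │ A  │ B │
├─────┼────┼───┤
│ 1   │ 1  │ M │
│ 2   │ 2  │ F │
│ 3   │ 3  │ F │
│ 4   │ 4  │ M │
│ 5   │ 5  │ F │
│ 6   │ 6  │ M │
│ 7   │ 7  │ F │
│ 8   │ 8  │ F │
│ 9   │ 9  │ M │
│ 10  │ 10 │ M │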



In [23]:

    
println(head(df))
println(tail(df))
println(df[1:3, :])









    



6×2 DataFrames.DataFrame
│ Row │ A │ B   │
├─────┼───┼─────┤
│ 1   │ 1 │ "M" │
│ 2   │ 2 │ "F" │
│ 3   │ 3 │ "F" │
│ 4   │ 4 │ "M" │
│ 5   │ 5 │ "F" │
│ 6   │ 6 │ "M" │
6×2 DataFrames.DataFrame
│ Row │ A  │ B   │
├─────┼────┼─────┤
│ 1   │ 5  │ "F" │
│ 2   │ 6  │ "M" │
│ 3   │ 7  │ "F" │
│ 4   │ 8  │ "F" │
│ 5   │ 9  │ "M" │
│ 6   │ 10 │ "M" │
3×2 DataFrames.DataFrame
│ Row │ A │ B   │
├─────┼───┼─────┤
│ 1   │ 1 │ "M" │
│ 2   │ 2 │ "F" │
│ 3   │ 3 │ "F" │



In [24]:

    
describe(df)









    



A
Min      1.0
1st Qu.  3.25
Median   5.5
Mean     5.5
3rd Qu.  7.75
Max      10.0
NAs      0
NA%      0.0%

B
Length  10
Type    String
NAs     0
NA%     0.0%
Unique  2



In [25]:

    
println(mean(df[:A]))
println(median(df[:A]))



In [26]:

    
df2 = DataFrame(A = 1:4, B = randn(4))
println(df2)
colwise(cumsum, df2)









    



4×2 DataFrames.DataFrame
│ Row │ A │ B        │
├─────┼───┼──────────┤
│ 1   │ 1 │ -1.11442 │
│ 2   │ 2 │ -2.34239 │
│ 3   │ 3 │ -0.53598 │
│ 4   │ 4 │ 1.0139   │






    Out[26]:





2-element Array{Any,1}:
 DataArrays.DataArray{Int64,1}[[1,3,6,10]]                             
 DataArrays.DataArray{Float64,1}[[-1.11442,-3.45681,-3.99279,-2.97889]]

Example



In [3]:

    
dataframe = readtable("train.csv")
head(dataframe)









    Out[3]:




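│ Row │ PassengerId │ Survived │ Pclass │ Name                                                │ Sex    │ Age  │ SibSp │ Parch │ Ticket           │ Fare    │ Cabin │ Embarked │
├─────┼─────────────┼──────────┼────────┼─────────────────────────────────────────────────────┼────────┼──────┼───────┼───────┼──────────────────┼─────────┼───────┼──────────┤
│ 1   │ 1           │ 0        │ 3      │ Braund, Mr. Owen Harris                             │ male   │ 22.0 │ 1     │ 0     │ A/5 21171        │ 7.25    │ NA    │ S        │
│ 2   │ 2           │ 1        │ 1      │ Cumings, Mrs. John Bradley (Florence Briggs Thayer) │ female │ 38.0 │ 1     │ 0     │ PC 17599         │ 71.2833 │ C85   │ C        │
│ 3   │ 3           │ 1        │ 3      │ Heikkinen, Miss. Laina                              │ female │ 26.0 │ 0     │ 0     │ STON/O2. 3101282 │ 7.925   │ NA    │ S        │
│ 4   │ 4           │ 1        │ 1      │ Futrelle, Mrs. Jacques Heath (Lily May Peel)        │ female │ 35.0 │ 1     │ 0     │ 113803           │ 53.1    │ C123  │ S        │
│ 5   │ 5           │ 0        │ 3      │ Allen, Mr. William Henry                            │ male   │ 35.0 │ 0     │ 0     │ 373450           │ 8.05    │ NA    │ S        │
│ 6   │ 6           │ 0        │ 3      │ Moran, Mr. James                                    │ male   │ NA   │ 0     │ 0     │ 330877           │ 8.4583  │ NA    │ Q        │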



In [4]:

    
dataframe[:familysize] = dataframe[:SibSp] + dataframe[:Parch]
head(dataframe)









    Out[4]:




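│ Row │ PassengerId │ Survived │ Pclass │ Name                                                │ Sex    │ Age  │ SibSp │ Parch │ Ticket           │ Fare    │ Cabin │ Embarked │ familysize │
├─────┼─────────────┼──────────┼────────┼─────────────────────────────────────────────────────┼────────┼──────┼───────┼───────┼──────────────────┼─────────┼───────┼──────────┼────────────┤
│ 1   │ 1           │ 0        │ 3      │ Braund, Mr. Owen Harris                             │ male   │ 22.0 │ 1     │ 0     │ A/5 21171        │ 7.25    │ NA    │ S        │ 1          │
│ 2   │ 2           │ 1        │ 1      │ Cumings, Mrs. John Bradley (Florence Briggs Thayer) │ female │ 38.0 │ 1     │ 0     │ PC 17599         │ 71.2833 │ C85   │ C        │ 1          │
│ 3   │ 3           │ 1        │ 3      │ Heikkinen, Miss. Laina                              │ female │ 26.0 │ 0     │ 0     │ STON/O2. 3101282 │ 7.925   │ NA    │ S        │ 0          │
│ 4   │ 4           │ 1        │ 1      │ Futrelle, Mrs. Jacques Heath (Lily May Peel)        │ female │ 35.0 │ 1     │ 0     │ 113803           │ 53.1    │ C123  │ S        │ 1          │
│ 5   │ 5           │ 0        │ 3      │ Allen, Mr. William Henry                            │ male   │ 35.0 │ 0     │ 0     │ 373450           │ 8.05    │ NA    │ S        │ 0          │
│ 6   │ 6           │ 0        │ 3      │ Moran, Mr. James                                    │ male   │ NA   │ 0     │ 0     │ 330877           │ 8.4583  │ NA    │ Q        │ 0          │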



In [5]:

    
dataframe[:Age] = convert(Array, dataframe[:Age], mean(dropna(dataframe[:Age])))
head(dataframe)









    Out[5]:




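│ Row │ PassengerId │ Survived │ Pclass │ Name                                                │ Sex    │ Age               │ SibSp │ Parch │ Ticket           │ Fare    │ Cabin │ Embarked │ familysize │
├─────┼─────────────┼──────────┼────────┼─────────────────────────────────────────────────────┼────────┼───────────────────┼───────┼───────┼──────────────────┼─────────┼───────┼──────────┼────────────┤
│ 1   │ 1           │ 0        │ 3      │ Braund, Mr. Owen Harris                             │ male   │ 22.0              │ 1     │ 0     │ A/5 21171        │ 7.25    │ NA    │ S        │ 1          │
│ 2   │ 2           │ 1        │ 1      │ Cumings, Mrs. John Bradley (Florence Briggs Thayer) │ female │ 38.0              │ 1     │ 0     │ PC 17599         │ 71.2833 │ C85   │ C        │ 1          │
│ 3   │ 3           │ 1        │ 3      │ Heikkinen, Miss. Laina                              │ female │ 26.0              │ 0     │ 0     │ STON/O2. 3101282 │ 7.925   │ NA    │ S        │ 0          │
│ 4   │ 4           │ 1        │ 1      │ Futrelle, Mrs. Jacques Heath (Lily May Peel)        │ female │ 35.0              │ 1     │ 0     │ 113803           │ 53.1    │ C123  │ S        │ 1          │
│ 5   │ 5           │ 0        │ 3      │ Allen, Mr. William Henry                            │ male   │ 35.0              │ 0     │ 0     │ 373450           │ 8.05    │ NA    │ S        │ 0          │
│ 6   │ 6           │ 0        │ 3      │ Moran, Mr. James                                    │ male   │ 29.69911764705882 │ 0     │ 0     │ 330877           │ 8.4583  │ NA    │ Q        │ 0          │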



In [6]:

    
head(dataframe[dataframe[:Sex] .== "male", :])









    Out[6]:




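│ Row │ PassengerId │ Survived │ Pclass │ Name                           │ Sex  │ Age               │ SibSp │ Parch │ Ticket    │ Fare    │ Cabin │ Embarked │ familysize │
├─────┼─────────────┼──────────┼────────┼────────────────────────────────┼──────┼───────────────────┼───────┼───────┼───────────┼─────────┼───────┼──────────┼────────────┤
│ 1   │ 1           │ 0        │ 3      │ Braund, Mr. Owen Harris        │ male │ 22.0              │ 1     │ 0     │ A/5 21171 │ 7.25    │ NA    │ S        │ 1          │
│ 2   │ 5           │ 0        │ 3      │ Allen, Mr. William Henry       │ male │ 35.0              │ 0     │ 0     │ 373450    │ 8.05    │ NA    │ S        │ 0          │
│ 3   │ 6           │ 0        │ 3      │ Moran, Mr. James               │ male │ 29.69911764705882 │ 0     │ 0     │ 330877    │ 8.4583  │ NA    │ Q        │ 0          │
│ 4   │ 7           │ 0        │ 1      │ McCarthy, Mr. Timothy J        │ male │ 54.0              │ 0     │ 0     │ 17463     │ 51.8625 │ E46   │ S        │ 0          │
│ 5   │ 8           │ 0        │ 3      │ Palsson, Master. Gosta Leonard │ male │ 2.0               │ 3     │ 1     │ 349909    │ 21.075  │ NA    │ S        │ 4          │
│ 6   │ 13          │ 0        │ 3      │ Saundercock, Mr. William Henry │ male │ 20.0              │ 0     │ 0     │ A/5. 2151 │ 8.05    │ NA    │ S        │ 0          │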

Accessing available public classic datasets



In [7]:

    
using RDatasets
iris = dataset("datasets", "iris")
head(iris)









    Out[7]:




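│ Row │ SepalLength │ SepalWidth │ PetalLength │ PetalWidth │ Species │
├─────┼─────────────┼────────────┼─────────────┼────────────┼─────────┤
│ 1   │ 5.1         │ 3.5        │ 1.4         │ 0.2        │ setosa  │
│ 2   │ 4.9         │ 3.0        │ 1.4         │ 0.2        │ setosa  │
│ 3   │ 4.7         │ 3.2        │ 1.3         │ 0.2        │ setosa  │
│ 4   │ 4.6         │ 3.1        │ 1.5         │ 0.2        │ setosa  │
│ 5   │ 5.0         │ 3.6        │ 1.4         │ 0.2        │ setosa  │
│ 6   │ 5.4         │ 3.9        │ 1.7         │ 0.4        │ setosa  │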


